Polynucleotide associated with a type II diabetes mellitus comprising single nucleotide polymorphism, microarray and diagnostic kit comprising the same and method for analyzing polynucleotide using the same

ABSTRACT

Provided is a polynucleotide for diagnosis or treatment of type II diabetes mellitus, including at least 10 contiguous nucleotides of a nucleotide sequence selected from the group consisting of nucleotide sequences of SEQ ID NOS: 1-80 and including a nucleotide at position 101 of the nucleotide sequence, or a complementary polynucleotide thereof.

1. FIELD OF THE INVENTION

The present invention relates to a polynucleotide associated with type II diabetes mellitus, a microarray and a diagnostic kit including the same, and a method of analyzing polynucleotides associated with type II diabetes mellitus.

2. DESCRIPTION OF THE RELATED ART

The genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of progenitor nucleic acid sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 (1986)). The variant forms may confer an evolutionary advantage or disadvantage, relative to a progenitor form, or may be neutral. In some instances, a variant form confers a lethal disadvantage and is not transmitted to subsequent generations of the organism. In other instances, a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of most members of the species and effectively becomes the progenitor form. In many instances, both progenitor and variant form(s) survive and co-exist in a species population. The coexistence of multiple forms of a sequence gives rise to polymorphisms.

Several different types of polymorphisms are known, including restriction fragment length polymorphisms (RFLPs), short tandem repeats (STRs), variable number tandem repeats (VNTRs) and single-nucleotide polymorphisms (SNPs). Among them, SNPs take the form of single-nucleotide variations between individuals of the same species. When SNPs occur in protein coding sequences, any one of the polymorphic forms may give rise to the expression of a defective or a variant protein. On the other hand, when SNPs occur in non-coding sequences, some of these polymorphisms may result in the expression of defective or variant proteins (e.g., as a result of defective splicing). Other SNPs have no phenotypic effects.

It is known that human SNPs occur at a frequency of about 1 in 1,000 bp. When such SNPs induce a phenotypic expression such as a disease, polynucleotides containing the SNPs can be used as primers or probes for diagnosis of the disease. Monoclonal antibodies specifically binding with the SNPs can also be used in diagnosis of a disease. Currently, research into the nucleotide sequences and functions of SNPs is under way at many research institutes. The nucleotide sequences and other experimental results of the identified human SNPs have been compiled into databases so as to be easily accessible.

Even though findings available to date show that specific SNPs exist in human genomes or cDNAs, the phenotypic effects of such SNPs have not been revealed. With the exception of a few SNPs, the functions of most SNPs have not yet been disclosed.

It is known that 90-95% of all diabetes patients suffer from type II diabetes mellitus. Type II diabetes mellitus is a disorder that develops in persons who produce insulin abnormally or have low sensitivity to insulin, resulting in large changes in blood glucose level. When a disorder of insulin secretion leads to type II diabetes mellitus, blood glucose cannot be transferred to body cells, which makes the conversion of food into energy difficult. It is known that genetic factors play a role in type II diabetes mellitus. Other risk factors for type II diabetes mellitus are age over 45, family history of diabetes mellitus, obesity, hypertension, and high cholesterol level. Currently, diagnosis of diabetes mellitus is mainly made by measuring a pathological phenotypic change, i.e., blood glucose level, using a fasting blood glucose (FSB) test, an oral glucose tolerance test (OGTT), and the like [National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health, 2003]. When a diagnosis of type II diabetes mellitus is made, the disease can be prevented or its onset can be delayed by exercise, special diet, body weight control, drug therapy, and the like. In this regard, type II diabetes mellitus is a disease for which early diagnosis is highly desirable. Millennium Pharmaceuticals Inc. reported that diagnosis and prognosis of type II diabetes mellitus can be made based on genotypic variations present in the HNF1 gene [PR Newswire, Sep. 1, 1998]. Sequenom Inc. reported that the FOXA2 (HNF3β) gene is highly associated with type II diabetes mellitus [PR Newswire, Oct. 28, 2003]. Even though there are reports about some genes associated with type II diabetes mellitus, research into the incidence of type II diabetes mellitus has focused on specific genes of some chromosomes in specific populations. For this reason, research results may vary between populations. Furthermore, not all causative genes responsible for type II diabetes mellitus have been identified. Diagnosis of type II diabetes mellitus by such molecular biological techniques is still uncommon. In addition, early diagnosis before the onset of type II diabetes mellitus is currently unavailable. Therefore, there is an increasing need to find new SNPs highly associated with type II diabetes mellitus, and related genes, across the whole human genome and to make early diagnosis of type II diabetes mellitus using the SNPs and the related genes.

SUMMARY OF THE INVENTION

The present invention provides a polynucleotide containing a single-nucleotide polymorphism associated with type II diabetes mellitus.

The present invention also provides a microarray and a type II diabetes mellitus diagnostic kit, each of which includes the polynucleotide containing a single-nucleotide polymorphism associated with type II diabetes mellitus.

The present invention also provides a method of analyzing polynucleotides associated with type II diabetes mellitus.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a polynucleotide for diagnosis or treatment of type II diabetes mellitus, including at least 10 contiguous nucleotides of a nucleotide sequence selected from the group consisting of nucleotide sequences of SEQ ID NOS: 1-80 and including a nucleotide of a polymorphic site (position 101) of the nucleotide sequence, or a complementary polynucleotide thereof.

The polynucleotide includes a contiguous span of at least 10 nucleotides containing the polymorphic site of a nucleotide sequence selected from the nucleotide sequences of SEQ ID NOS: 1-80. The polynucleotide is 10 to 400 nucleotides in length, preferably 10 to 100 nucleotides in length, and more preferably 10 to 50 nucleotides in length. Here, the polymorphic site of each nucleotide sequence of SEQ ID NOS: 1-80 is at position 101.
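The length and position constraints above can be expressed as a simple check. The following is a minimal sketch only; the function name and the use of 1-based start coordinates are assumptions made for illustration and are not part of the specification.

```python
POLYMORPHIC_SITE = 101  # 1-based position of the SNP stated in the text


def spans_polymorphic_site(start, length, site=POLYMORPHIC_SITE):
    """Return True if a fragment covers the polymorphic site.

    `start` is the 1-based position of the first nucleotide of the
    fragment within the polymorphic sequence, and `length` is the
    fragment length. The 10 to 400 nucleotide range is taken from the text.
    """
    if not 10 <= length <= 400:
        return False
    end = start + length - 1  # 1-based position of the last nucleotide
    return start <= site <= end


# A 10-mer starting at position 92 ends exactly at position 101.
print(spans_polymorphic_site(92, 10))
# A 10-mer starting at position 102 misses the polymorphic site.
print(spans_polymorphic_site(102, 10))
```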

Each of the nucleotide sequences of SEQ ID NOS: 1-80 is a polymorphic sequence. The polymorphic sequence refers to a nucleotide sequence containing a polymorphic site at which a single-nucleotide polymorphism (SNP) occurs. The polymorphic site refers to a position of a polymorphic sequence at which the SNP occurs. The nucleotide sequences may be DNAs or RNAs.

In the present invention, the polymorphic sites (position 101) of the polymorphic sequences of SEQ ID NOS: 1-80 are associated with type II diabetes mellitus. This was confirmed by DNA sequence analysis of blood samples derived from type II diabetes mellitus patients and normal persons. The association of the polymorphic sequences of SEQ ID NOS: 1-80 with type II diabetes mellitus and the characteristics of the polymorphic sequences are summarized in Tables 1-1, 1-2, 2-1, and 2-2.

TABLE 1-1

                            Allele frequency       Genotype frequency
ASSAY_ID  SNP   SEQ ID NO.  cas_A2  con_A2  Delta  cas_A1A1  cas_A1A2  cas_A2A2
DMX_001   C→T   1 and 2     0.592   0.492   0.1    54        136       109
DMX_003   A→G   3 and 4     0.292   0.202   0.09   157       108       33
DMX_005   A→G   5 and 6     0.871   0.913   0.042  3         71        224
DMX_008   G→A   7 and 8     0.218   0.158   0.06   180       103       13
DMX_009   T→G   9 and 10    0.664   0.737   0.073  31        138       129
DMX_011   A→G   11 and 12   0.866   0.931   0.065  7         66        225
DMX_012   C→G   13 and 14   0.527   0.614   0.087  72        140       88
DMX_014   A→C   15 and 16   0.903   0.837   0.066  0         58        240
DMX_016   A→G   17 and 18   0.275   0.209   0.066  158       116       24
DMX_019   T→C   19 and 20   0.961   0.924   0.037  1         21        275
DMX_027   G→C   21 and 22   0.844   0.89    0.046  3         87        208
DMX_028   T→C   23 and 24   0.945   0.977   0.032  0         33        266
DMX_029   C→A   25 and 26   0.057   0.104   0.047  268       28        3
DMX_030   C→T   27 and 28   0.077   0.129   0.052  251       41        2
DMX_031   A→T   29 and 30   0.916   0.86    0.056  2         46        251
DMX_032   T→A   31 and 32   0.718   0.593   0.125  26        117       157
DMX_033   T→C   33 and 34   0.816   0.9     0.084  10        89        198
DMX_044   A→T   35 and 36   0.846   0.787   0.059  7         78        213
DMX_049   T→A   37 and 38   0.107   0.06    0.047  236       60        2
DMX_052   C→T   39 and 40   0.94    0.907   0.033  3         29        261

          Genotype frequency            df = 2
ASSAY_ID  con_A1A1  con_A1A2  con_A2A2  Chi_value  Chi_exact_p-Value
DMX_001   77        151       72        12.384     2.05E−03
DMX_003   190       97        12        13.527     1.16E−03
DMX_005   4         44        251       8.015      1.82E−02
DMX_008   205       80        6         7.051      2.94E−02
DMX_009   19        119       161       7.814      2.01E−02
DMX_011   1         39        258       13.698     1.06E−03
DMX_012   44        139       111       9.361      9.28E−03
DMX_014   10        77        211       14.539     6.97E−04
DMX_016   182       95        13        6.947      3.10E−02
DMX_019   2         41        254       7.619      2.22E−02
DMX_027   3         60        237       6.842      3.27E−02
DMX_028   0         14        284       8.268      5.72E−03
DMX_029   241       52        5         9.131      1.04E−02
DMX_030   221       70        3         9.683      7.89E−03
DMX_031   6         72        221       9.636      8.08E−03
DMX_032   51        142       107       20         4.54E−05
DMX_033   4         51        239       16.718     2.34E−04
DMX_044   15        93        181       6.687      3.53E−02
DMX_049   263       36        0         9.459      8.83E−03
DMX_052   1         52        238       8.584      1.37E−02

                       Odds ratio (OR): multiple model  HWE                     Sample call rate
ASSAY_ID  Risk allele  OR    CI                         con_HW      cas_HW      cas_call_rate  con_call_rate
DMX_001   A2 T         1.49  (1.193, 1.887)             0.027, HWE  1.195, HWE  1              1
DMX_003   A2 G         1.61  (1.245, 2.123)             0.01, HWE   4.819, HWE  0.99           1
DMX_005   A1 A         1.56  (1.074, 2.259)             2.208, HWE  0.646, HWE  0.99           1
DMX_008   A2 A         1.49  (1.104, 1.996)             0.265, HWE  0.167, HWE  0.99           0.97
DMX_009   A1 T         1.42  (1.106, 1.82)              0.195, HWE  0.424, HWE  0.99           1
DMX_011   A1 A         2.10  (1.414, 3.115)             0.026, HWE  0.948, HWE  0.99           0.99
DMX_012   A1 C         1.43  (1.135, 1.8)               0.032, HWE  1.218, HWE  1              0.98
DMX_014   A2 C         1.82  (1.274, 2.551)             1.527, HWE  1.834, HWE  0.99           0.99
DMX_016   A2 G         1.45  (1.1, 1.883)               0.089, HWE  0.241, HWE  0.99           0.97
DMX_019   A2 C         2.04  (1.215, 3.413)             1.004, HWE  0.549, HWE  0.99           0.99
DMX_027   A1 G         1.50  (1.067, 2.098)             0.069, HWE  3.4, HWE    0.99           1
DMX_028   A1 T         2.43  (1.286, 4.585)             0.077, HWE  0.133, HWE  1              0.99
DMX_029   A1 C         1.93  (1.247, 2.975)             1.514, HWE  HWD         1              0.99
DMX_030   A1 C         1.79  (1.215, 2.64)              0.51, HWE   1.004, HWE  0.98           0.98
DMX_031   A2 T         1.79  (1.238, 2.591)             0.214, HWE  0.004, HWE  1              1
DMX_032   A2 A         1.75  (1.374, 2.227)             0.148, HWE  0.582, HWE  1              1
DMX_033   A1 T         2.02  (1.434, 2.831)             2.023, HWE  0.005, HWE  0.99           0.98
DMX_044   A2 T         1.47  (1.099, 1.996)             0.452, HWE  0.013, HWE  0.99           0.96
DMX_049   A2 G         1.89  (1.227, 2.874)             0.527, HWE  0.623, HWE  0.99           1
DMX_052   A2 T         1.61  (1.035, 2.506)             0.688, HWE  4.52, HWE   0.98           0.97

TABLE 1-2

                            Allele frequency       Genotype frequency
ASSAY_ID  SNP   SEQ ID NO.  cas_A2  con_A2  Delta  cas_A1A1  cas_A1A2  cas_A2A2
DMX_054   C→A   41 and 42   0.14    0.09    0.05   222       70        7
DMX_056   A→G   43 and 44   0.362   0.273   0.089  123       137       40
DMX_060   G→A   45 and 46   0.957   0.925   0.032  0         25        267
DMX_061   T→C   47 and 48   0.758   0.81    0.052  11        121       164
DMX_062   C→T   49 and 50   0.421   0.508   0.087  106       133       59
DMX_063   G→C   51 and 52   0.902   0.953   0.051  2         55        243
DMX_065   C→T   53 and 54   0.92    0.958   0.038  4         39        250
DMX_067   G→A   55 and 56   0.903   0.941   0.038  2         54        243
DMX_068   A→G   57 and 58   0.081   0.133   0.052  252       42        3
DMX_069   T→C   59 and 60   0.44    0.498   0.058  96        143       60
DMX_104   T→C   61 and 62   0.274   0.204   0.07   158       115       24
DMX_105   A→C   63 and 64   0.769   0.838   0.069  19        100       180
DMX_116   T→C   65 and 66   0.6     0.668   0.068  41        157       101
DMX_117   T→C   67 and 68   0.188   0.251   0.063  199       89        12
DMX_120   A→G   69 and 70   0.818   0.871   0.053  7         95        197
DMX_136   T→C   71 and 72   0.211   0.263   0.052  188       96        15
DMX_139   A→G   73 and 74   0.17    0.105   0.065  205       88        7
DMX_150   A→G   75 and 76   0.926   0.958   0.032  0         44        252
DMX_152   A→C   77 and 78   0.562   0.64    0.078  62        136       99
DMX_154   A→G   79 and 80   0.269   0.199   0.07   153       131       15

          Genotype frequency            df = 2
ASSAY_ID  con_A1A1  con_A1A2  con_A2A2  Chi_value  Chi_exact_p-Value
DMX_054   248       48        3         7.14       2.82E−02
DMX_056   160       116       24        10.581     5.04E−03
DMX_060   3         39        257       6.171      4.57E−02
DMX_061   13        87        198       8.911      1.16E−02
DMX_062   72        146       77        9.468      8.79E−03
DMX_063   1         26        272       12.347     2.08E−03
DMX_065   0         24        263       7.84       1.98E−02
DMX_067   2         31        264       7.087      2.89E−02
DMX_068   227       59        10        7.934      1.89E−02
DMX_069   66        164       65        7.165      2.78E−02
DMX_104   184       95        12        7.821      2.00E−02
DMX_105   9         79        212       8.646      1.33E−02
DMX_116   29        139       129       6.554      3.77E−02
DMX_117   171       103       23        6.582      3.72E−02
DMX_120   3         70        222       6.853      3.25E−02
DMX_136   156       123       16        6.311      4.26E−02
DMX_139   239       59        2         11.102     3.88E−03
DMX_150   0         25        270       5.851      2.06E−02
DMX_152   41        129       123       7.034      2.97E−02
DMX_154   187       100       9         9.045      1.09E−02

                       Odds ratio (OR): multiple model  HWE                     Sample call rate
ASSAY_ID  Risk allele  OR    CI                         con_HW      cas_HW      cas_call_rate  con_call_rate
DMX_054   A2 A         1.64  (1.145, 2.364)             0.504, HWE  0.819, HWE  1              1
DMX_056   A2 G         1.52  (1.179, 1.923)             0.283, HWE  0.041, HWE  1              1
DMX_060   A2 A         1.82  (1.1, 3.012)               4.113, HWE  0.042, HWE  0.97           1
DMX_061   A1 T         1.36  (1.031, 1.798)             1.122, HWE  3.894, HWE  0.99           0.99
DMX_062   A1 C         1.42  (1.131, 1.788)             0.034, HWE  2.43, HWE   0.99           0.98
DMX_063   A1 G         2.22  (1.395, 3.534)             0.504, HWE  0.08, HWE   1              1
DMX_065   A1 C         2.00  (1.205, 3.314)             0.043, HWE  9.409, HWE  0.98           0.96
DMX_067   A1 G         1.72  (1.109, 2.653)             1.047, HWE  0.077, HWE  1              0.99
DMX_068   A1 A         1.75  (1.2, 2.557)               6.304, HWE  4.107, HWE  0.99           0.99
DMX_069   A1 T         1.27  (1.007, 1.59)              3.708, HWE  0.364, HWE  1              0.98
DMX_104   A2 C         1.47  (1.122, 1.927)             0.011, HWE  0.284, HWE  0.99           0.97
DMX_105   A1 A         1.56  (1.165, 2.077)             0.64, HWE   1.497, HWE  1              1
DMX_116   A1 T         1.34  (1.059, 1.7)               0.838, HWE  2.473, HWE  1              0.99
DMX_117   A1 T         1.44  (1.095, 1.902)             2.116, HWE  0.464, HWE  1              0.99
DMX_120   A1 A         1.51  (1.097, 2.072)             0.497, HWE  0.894, HWE  1              0.98
DMX_136   A1 T         1.33  (1.02, 1.746)              1.611, HWE  0.42, HWE   1              0.98
DMX_139   A2 G         1.75  (1.247, 2.445)             0.498, HWE  0.32, HWE   1              1
DMX_150   A1 A         1.81  (1.095, 3.006)             0.174, HWE  0.654, HWE  0.99           0.98
DMX_152   A1 A         1.38  (1.095, 1.748)             0.774, HWE  1.715, HWE  0.99           0.98
DMX_154   A2 G         1.47  (1.129, 1.942)             0.768, HWE  3.616, HWE  1              0.99

TABLE 2-1

ASSAY_ID  rs         SNP site  SNP sequence (SEQ ID NO)  Chromosome #  Chromosome position  Band      Gene
DMX_001   rs502612   C→T       1 and 2                   1             167373461            1q24.2    PRRX1
DMX_003   rs1483     A→G       3 and 4                   1             223672376            1q42.13   CDC42BPA
DMX_005   rs632585   A→G       5 and 6                   1             228802209            1q42.2    Between genes
DMX_008   rs177560   G→A       7 and 8                   11            16911751             11p15.1   Between genes
DMX_009   rs1394720  T→G       9 and 10                  11            4533242              11p15.4   Between genes
DMX_011   rs488115   A→G       11 and 12                 11            74409538             11q13.4   Between genes
DMX_012   rs2063728  C→G       13 and 14                 11            77863284             11q14.1   FLJ23441
DMX_014   rs725834   A→C       15 and 16                 13            99254859             13q32.3   CLYBL
DMX_016   rs767837   A→G       17 and 18                 13            48218663             13q14.2   Between genes
DMX_019   rs929703   T→C       19 and 20                 14            77691031             14q24.3   Between genes
DMX_027   rs739637   G→C       21 and 22                 17            37534470             17q21.2   RAB5C
DMX_028   rs1990936  T→C       23 and 24                 17            44307486             17q21.32  Between genes
DMX_029   rs2051672  C→A       25 and 26                 17            5847149              17p13.2   Between genes
DMX_030   rs1038308  C→T       27 and 28                 18            44538585             18q21.1   KIAA0427
DMX_031   rs655080   A→T       29 and 30                 18            57917416             18q21.33  PIGN
DMX_032   rs1943317  T→A       31 and 32                 18            62419479             18q22.1   Between genes
DMX_033   rs929476   T→C       33 and 34                 19            33499519             19q12     Between genes
DMX_044   rs1984388  A→T       35 and 36                 22            30658575             22q12.3   Between genes
DMX_049   rs1707709  T→G       37 and 38                 3             166922235            3q26.1    Between genes
DMX_052   rs1786     C→T       39 and 40                 4             15340722             4p15.33   Between genes

ASSAY_ID  Description                                     SNP function   Amino acid change  Remarks
DMX_001   Paired related homeobox 1                       Intron         No change
DMX_003   CDC42 binding protein kinase alpha (DMPK-like)  Intron         No change
DMX_005   —                                               Between genes  No change
DMX_008   —                                               Between genes  No change
DMX_009   —                                               Between genes  No change
DMX_011   —                                               Between genes  No change
DMX_012   Hypothetical protein FLJ23441                   Intron         No change
DMX_014   Citrate lyase beta like                         Intron         No change
DMX_016   —                                               Between genes  No change
DMX_019   —                                               Between genes  No change
DMX_027   RAB5C, RAS oncogene family                      Intron         No change
DMX_028   —                                               Between genes  No change
DMX_029   —                                               Between genes  No change
DMX_030   KIAA0427                                        Coding-synon   No change
DMX_031   Phosphatidylinositol glycan, class N            Intron         No change
DMX_032   —                                               Between genes  No change
DMX_033   —                                               Between genes  No change
DMX_044   —                                               Between genes  No change
DMX_049   —                                               Between genes  No change
DMX_052   —                                               Between genes  No change

TABLE 2-2

ASSAY_ID  rs         SNP site  SNP sequence (SEQ ID NO)  Chromosome #  Chromosome position  Band      Gene
DMX_054   rs872883   C→A       41 and 42                 4             6582619              4p16.1    PPP2R2C
DMX_056   rs752139   A→G       43 and 44                 5             175943870            5q35.2    PC-LKC
DMX_060   rs1769972  G→A       45 and 46                 6             106782512            6q21      APG5L
DMX_061   rs1322532  T→C       47 and 48                 6             19175693             6p22.3    Between genes
DMX_062   rs2058501  C→T       49 and 50                 7             120274187            7q31.31   FLJ21986
DMX_063   rs1563047  G→C       51 and 52                 7             134030698            7q33      CALD1
DMX_065   rs38809    C→T       53 and 54                 7             91792235             7q21.2    PEX1
DMX_067   rs1054748  G→A       55 and 56                 8             37837626             8p12      RAB11FIP
DMX_068   rs1434940  A→G       57 and 58                 8             69660204             8q13.3    VEST1
DMX_069   rs1059033  T→C       59 and 60                 9             77736025             9q21.2    GNAQ
DMX_104   rs492220   T→C       61 and 62                 1             94254590             1p22.1    ABCA4
DMX_105   rs685328   A→C       63 and 64                 10            117138050            10q25.3   ATRNL1
DMX_116   rs1461986  T→C       65 and 66                 13            75506683             13q22.2   Between genes
DMX_117   rs1815620  T→C       67 and 68                 14            50995615             14q22.1   Between genes
DMX_120   rs293398   A→G       69 and 70                 15            87459425             15q26.1   ABHD2
DMX_136   rs1686492  T→C       71 and 72                 2             10915411             2p25.1    Between genes
DMX_139   rs1237905  A→G       73 and 74                 2             168278137            2q24.3    Between genes
DMX_150   rs589682   A→G       75 and 76                 3             122172648            3q13.33   STXBP5L
DMX_152   rs607209   A→C       77 and 78                 4             16808165             4p15.32   Between genes
DMX_154   rs197367   A→G       79 and 80                 7             36219096             7p14.2    ANLN

ASSAY_ID  Description                                                                       SNP function     Amino acid change  Remarks
DMX_054   Protein phosphatase 2 (formerly 2A), regulatory subunit B (PR 52), gamma isoform  Intron           No change
DMX_056   Protocadherin LKC                                                                 Intron           No change
DMX_060   APG5 autophagy 5-like (S. cerevisiae)                                             Intron           No change
DMX_061   —                                                                                 Between genes    No change
DMX_062   Hypothetical protein FLJ21986                                                     Intron           No change
DMX_063   Caldesmon 1                                                                       Intron           No change
DMX_065   Peroxisome biogenesis factor 1                                                    Intron           No change
DMX_067   RAB11 family interacting protein 1 (class I)                                      Not classified   No change          Between genes in NCBI build 119
DMX_068   Vestibule-1 protein                                                               Intron           No change
DMX_069   Guanine nucleotide binding protein (G protein), q polypeptide                     Intron           No change
DMX_104   ATP-binding cassette, sub-family A (ABC1), member 4                               Intron           No change
DMX_105   Attractin-like 1                                                                  Intron           No change          KIAA0534 in NCBI build 119
DMX_116   —                                                                                 Between genes    No change
DMX_117   —                                                                                 Between genes    No change
DMX_120   Abhydrolase domain containing 2                                                   mrna-utr         No change
DMX_136   —                                                                                 Between genes    No change
DMX_139   —                                                                                 Between genes    No change
DMX_150   Syntaxin binding protein 5-like                                                   Intron           No change          KIAA1006 in NCBI build 119
DMX_152   —                                                                                 Between genes    No change
DMX_154   Anillin, actin binding protein (scraps homolog, Drosophila)                       coding-nonsynon  K→R

In Tables 1-1 and 1-2, the contents in columns are as defined below.

-   ASSAY_ID represents a marker name.
-   SNP is the polymorphic base at a SNP polymorphic site. Here, A1 and A2 represent respectively a low mass allele and a high mass allele as a result of sequence analysis according to a homogeneous MassExtension (hME) technique (Sequenom) and are arbitrarily designated for convenience of experiments.
-   SNP sequence represents a sequence containing a SNP site, i.e., a sequence containing allele A1 or A2 at position 101.
-   In the allele frequency columns, cas_A2, con_A2, and Delta respectively represent the allele A2 frequency of the case group, the allele A2 frequency of the normal group, and the absolute value of the difference between cas_A2 and con_A2. Here, cas_A2 is (genotype A2A2 frequency×2+genotype A1A2 frequency)/(the number of samples×2) in the case group and con_A2 is (genotype A2A2 frequency×2+genotype A1A2 frequency)/(the number of samples×2) in the normal group.
-   Genotype frequency represents the frequency of each genotype. Here, cas_A1A1, cas_A1A2, and cas_A2A2 are the numbers of persons with genotypes A1A1, A1A2, and A2A2, respectively, in the case group, and con_A1A1, con_A1A2, and con_A2A2 are the numbers of persons with genotypes A1A1, A1A2, and A2A2, respectively, in the normal group.
-   df=2 indicates a chi-square test with two degrees of freedom. Chi_value represents the chi-squared value, and the p-value is determined based on the chi_value. Chi_exact_p-value represents the p-value of Fisher's exact test for the chi-square test. When the number of persons with a given genotype is less than 5, results of the chi-square test may be inaccurate. In this case, a more accurate determination of statistical significance (p-value) by Fisher's exact test is required. The chi_exact_p-value is a variable used in Fisher's exact test. In the present invention, when the p-value≤0.05, it is considered that the genotype distribution of the case group is different from that of the normal group, i.e., there is a significant difference between the case group and the normal group.
-   In the risk allele column, when the reference allele is A2 and the allele A2 frequency of the case group is larger than the allele A2 frequency of the normal group (i.e., cas_A2>con_A2), allele A2 is regarded as the risk allele. In the opposite case, allele A1 is regarded as the risk allele.
-   Odds ratio represents the ratio of the odds of the risk allele in the case group to the odds of the risk allele in the normal group. In the present invention, the Mantel-Haenszel odds ratio method was used. CI represents the 95% confidence interval for the odds ratio and is represented by (lower limit of the confidence interval, upper limit of the confidence interval). When 1 falls within the confidence interval, it is considered that there is no significant association of the risk allele with the disease.
-   HWE represents Hardy-Weinberg Equilibrium. Here, con_HW and cas_HW represent the degree of deviation from Hardy-Weinberg Equilibrium in the normal group and the case group, respectively. Based on chi_value=6.63 (p-value=0.01, df=1) in a chi-square (df=1) test, a value larger than 6.63 was regarded as Hardy-Weinberg Disequilibrium (HWD) and a value smaller than 6.63 was regarded as Hardy-Weinberg Equilibrium (HWE).
-   Call rate represents the ratio of the number of genotype-interpretable samples to the total number of samples used in experiments. Here, cas_call_rate and con_call_rate represent the ratio of the number of genotype-interpretable samples to the total number (300 persons) of samples used in the case group and the normal group, respectively.
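The column definitions above can be illustrated with a short calculation. The sketch below uses the DMX_001 genotype counts from Table 1-1. The function names are assumptions made for illustration. The Pearson chi-square statistic is shown in place of Fisher's exact test, and the plain Pearson goodness-of-fit statistic is assumed for the Hardy-Weinberg check, because the text does not give the exact formula used for con_HW and cas_HW.

```python
import math


def a2_frequency(n_a1a1, n_a1a2, n_a2a2):
    """Allele A2 frequency: (2 x A2A2 count + A1A2 count) / (2 x samples)."""
    n = n_a1a1 + n_a1a2 + n_a2a2
    return (2 * n_a2a2 + n_a1a2) / (2 * n)


def risk_allele(cas_a2, con_a2):
    """A2 is the risk allele when it is more frequent in cases, else A1."""
    return "A2" if cas_a2 > con_a2 else "A1"


def genotype_chi_square(case, control):
    """Pearson chi-square on the 2 x 3 genotype table (df = 2).

    Returns (chi_value, p_value). For df = 2 the upper-tail probability
    of the chi-square distribution is exp(-chi_value / 2).
    Assumes every genotype was observed at least once overall.
    """
    rows = (case, control)
    row_totals = [sum(row) for row in rows]
    col_totals = [case[i] + control[i] for i in range(3)]
    total = sum(row_totals)
    chi_value = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            chi_value += (observed - expected) ** 2 / expected
    return chi_value, math.exp(-chi_value / 2)


def hardy_weinberg_chi_square(n_a1a1, n_a1a2, n_a2a2):
    """Goodness-of-fit statistic against Hardy-Weinberg proportions (df = 1)."""
    n = n_a1a1 + n_a1a2 + n_a2a2
    q = a2_frequency(n_a1a1, n_a1a2, n_a2a2)  # allele A2 frequency
    p = 1.0 - q                               # allele A1 frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_a1a1, n_a1a2, n_a2a2)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# DMX_001 genotype counts (A1A1, A1A2, A2A2) taken from Table 1-1
case = (54, 136, 109)
control = (77, 151, 72)
cas_a2 = a2_frequency(*case)
con_a2 = a2_frequency(*control)
chi_value, p_value = genotype_chi_square(case, control)
print(round(cas_a2, 3), round(con_a2, 3), risk_allele(cas_a2, con_a2))
print(round(chi_value, 3), f"{p_value:.2e}")
# A statistic above 6.63 would be labelled HWD under the rule in the text.
print(hardy_weinberg_chi_square(*control) > 6.63)
```

For DMX_001 the allele frequencies and the chi-squared value computed this way agree with the tabulated 0.592, 0.492, and 12.384.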

Tables 2-1 and 2-2 present characteristics of SNP markers based on the NCBI build 123.

As shown in Tables 1-1, 1-2, 2-1, and 2-2, according to the chi-square test of the polymorphic markers of SEQ ID NOS: 1-80 of the present invention, chi_exact_p-value ranges from 4.54×10⁻⁵ to 0.0457 at the 95% confidence level. This shows that there are significant differences between expected values and measured values in allele occurrence frequencies in the polymorphic markers of SEQ ID NOS: 1-80. The odds ratio ranges from 1.27 to 2.43, which shows that the polymorphic markers of SEQ ID NOS: 1-80 are associated with type II diabetes mellitus.
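As a rough illustration of how an odds ratio and its 95% confidence interval relate to the genotype counts, the sketch below computes an unstratified allelic odds ratio with a Woolf (logit) interval. This is a simpler stand-in and not the Mantel-Haenszel method named in the text, so small differences from the tabulated values are to be expected. The function name and argument layout are assumptions.

```python
import math


def allelic_odds_ratio(case, control, risk="A2", z=1.96):
    """Allelic odds ratio for the risk allele with a Woolf logit interval.

    `case` and `control` are (A1A1, A1A2, A2A2) genotype counts.
    Returns (odds_ratio, lower_limit, upper_limit).
    All four allele counts must be non-zero.
    """
    def allele_counts(genotypes):
        a1 = 2 * genotypes[0] + genotypes[1]
        a2 = 2 * genotypes[2] + genotypes[1]
        # Order the pair as (risk allele count, other allele count).
        return (a2, a1) if risk == "A2" else (a1, a2)

    case_risk, case_other = allele_counts(case)
    con_risk, con_other = allele_counts(control)
    odds_ratio = (case_risk * con_other) / (case_other * con_risk)
    standard_error = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / con_risk + 1 / con_other
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


# DMX_001 genotype counts from Table 1-1; the risk allele is A2.
odds_ratio, lower, upper = allelic_odds_ratio((54, 136, 109), (77, 151, 72))
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
# The marker is treated as associated when 1 lies outside the interval.
print(not (lower <= 1.0 <= upper))
```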

The SNPs of SEQ ID NOS: 1-80 of the present invention occur at significantly different frequencies in a type II diabetic patient group and a normal group. Therefore, the polynucleotide according to the present invention can be efficiently used in diagnosis, fingerprinting analysis, or treatment of type II diabetes mellitus. In detail, the polynucleotide of the present invention can be used as a primer or a probe for diagnosis of type II diabetes mellitus. Furthermore, the polynucleotide of the present invention can be used as antisense DNA or a composition for treatment of type II diabetes mellitus.

The present invention also provides an allele-specific polynucleotide for diagnosis of type II diabetes mellitus, which is hybridized with a polynucleotide including a contiguous span of at least 10 nucleotides containing a nucleotide of a polymorphic site of a nucleotide sequence selected from the group consisting of the nucleotide sequences of SEQ ID NOS: 1-80, or a complement thereof.

The allele-specific polynucleotide refers to a polynucleotide specifically hybridized with each allele. That is, the allele-specific polynucleotide is able to distinguish the nucleotides at the polymorphic sites within the polymorphic sequences of SEQ ID NOS: 1-80 and hybridizes specifically with each of the nucleotides. The hybridization is performed under stringent conditions, for example, conditions of 1 M or less in salt concentration and 25° C. or more in temperature. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM Na Phosphate, 5 mM EDTA, pH 7.4) and 25-30° C. are suitable for allele-specific probe hybridization.

In the present invention, the allele-specific polynucleotide may be a primer. As used herein, the term “primer” refers to a single stranded oligonucleotide that acts as a starting point of template-directed DNA synthesis under appropriate conditions, for example in a buffer containing four different nucleoside triphosphates and a polymerase such as DNA or RNA polymerase or reverse transcriptase, and at an appropriate temperature. The appropriate length of the primer may vary according to the purpose of use and is generally 15 to 30 nucleotides. Generally, a shorter primer molecule requires a lower temperature to form a stable hybrid with a template. A primer sequence is not necessarily completely complementary with a template but must be complementary enough to hybridize with the template. Preferably, the 3′ end of the primer is aligned with a nucleotide of each polymorphic site (position 101) of SEQ ID NOS: 1-80. The primer is hybridized with a target DNA containing a polymorphic site and initiates amplification of the allelic form with which the primer exhibits complete homology. The primer is used in a pair with a second primer hybridizing with the opposite strand. Amplified products are obtained by amplification using the two primers, which indicates that the specific allelic form is present. The primer of the present invention includes a polynucleotide fragment used in a ligase chain reaction (LCR).
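A primer whose 3′ end sits on the polymorphic site can be derived from a polymorphic sequence by simple slicing. The following is a minimal sketch, assuming a same-strand primer and a placeholder sequence; the function name is hypothetical and no real SEQ ID sequence is used.

```python
def allele_specific_primer(sequence, allele, length=20, site=101):
    """Build a primer whose 3' end is the given allele at the polymorphic site.

    `sequence` is the polymorphic sequence (5' to 3'), `site` is the
    1-based SNP position, and `length` is the primer length. The 15 to 30
    nucleotide range is the general range given in the text.
    """
    if not 15 <= length <= 30:
        raise ValueError("primer length should be 15 to 30 nucleotides")
    if len(sequence) < site or length > site:
        raise ValueError("sequence is too short for this primer")
    # Take the (length - 1) bases immediately upstream of the site,
    # then append the allele so that it becomes the 3'-terminal base.
    upstream = sequence[site - length:site - 1]
    return upstream + allele


# Placeholder 201-nt sequence (not a real SEQ ID) with the SNP at position 101.
placeholder = "ACGT" * 50 + "A"
for allele in ("A", "G"):
    print(allele, allele_specific_primer(placeholder, allele, length=15))
```

Each allele yields its own primer, differing only in the 3′-terminal base, which is what allows amplification to discriminate between the allelic forms.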

In the present invention, the allele-specific polynucleotide may be a probe. As used herein, the term “probe” refers to a hybridization probe, that is, an oligonucleotide capable of sequence-specifically binding with a complementary strand of a nucleic acid. Such a probe may be a peptide nucleic acid as disclosed in Science 254, 1497-1500 (1991) by Nielsen et al. The probe according to the present invention is an allele-specific probe. In this regard, when there are polymorphic sites in nucleic acid fragments derived from two members of the same species, the probe is hybridized with DNA fragments derived from one member but is not hybridized with DNA fragments derived from the other member. In this case, hybridization conditions should be stringent enough to allow hybridization with only one allele through a significant difference in hybridization strength between alleles. Preferably, the central portion of the probe, that is, position 7 for a 15 nucleotide probe, or position 8 or 9 for a 16 nucleotide probe, is aligned with each polymorphic site of the nucleotide sequences of SEQ ID NOS: 1-80. This alignment may produce a significant difference in hybridization between alleles. The probe of the present invention can be used in diagnostic methods for detecting alleles. The diagnostic methods include nucleic acid hybridization-based detection methods, e.g., Southern blot. In a case where DNA chips are used for the nucleic acid hybridization-based detection methods, the probe may be provided in an immobilized form on a substrate of a DNA chip.
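The placement of the polymorphic site near the centre of a probe can be sketched as follows. The function name, the placeholder sequence, and the choice to pass the in-probe position explicitly are assumptions for illustration; the example positions (7 for a 15-mer, 8 for a 16-mer) follow the text.

```python
def allele_specific_probes(sequence, alleles, length, snp_position, site=101):
    """Return one probe per allele with the SNP at `snp_position` in the probe.

    `site` is the 1-based SNP position in `sequence`, and `snp_position`
    is the 1-based position of the SNP within the probe.
    """
    if not 1 <= snp_position <= length:
        raise ValueError("snp_position must lie inside the probe")
    start = site - snp_position          # 0-based index of the first probe base
    end = start + length                 # 0-based index one past the last base
    if start < 0 or end > len(sequence):
        raise ValueError("probe does not fit inside the sequence")
    left = sequence[start:site - 1]      # bases upstream of the SNP
    right = sequence[site:end]           # bases downstream of the SNP
    return {allele: left + allele + right for allele in alleles}


# Placeholder sequence (not a real SEQ ID): 100 A's, the SNP, then 100 C's.
placeholder = "A" * 100 + "N" + "C" * 100
print(allele_specific_probes(placeholder, ("C", "T"), length=15, snp_position=7))
print(allele_specific_probes(placeholder, ("C", "T"), length=16, snp_position=8))
```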

The present invention also provides a microarray for diagnosis of type II diabetes mellitus, including the polynucleotide according to the present invention or the complementary polynucleotide thereof. The polynucleotide of the microarray may be DNA or RNA. The microarray is the same as a common microarray except that it includes the polynucleotide of the present invention.

The present invention also provides a type II diabetes mellitus diagnostic kit including the polynucleotide of the present invention. The type II diabetes mellitus diagnostic kit may include reagents necessary for polymerization, e.g., dNTPs, various polymerases, and a colorant, in addition to the polynucleotide according to the present invention.

The present invention also provides a method of diagnosing type II diabetes mellitus in an individual, which includes: isolating a nucleic acid sample from the individual; and determining a nucleotide of at least one polymorphic site (position 101) within polynucleotides of SEQ ID NOS: 1-80 or complementary polynucleotides thereof. Here, when the nucleotide of the at least one polymorphic site of the sample nucleic acid is the same as at least one risk allele presented in Tables 1-1, 1-2, 2-1, and 2-2, it may be determined that the individual has a higher likelihood of being diagnosed as at risk of developing type II diabetes mellitus.

The step of isolating the nucleic acid sample from the individual may be carried out by a common DNA isolation method. For example, the nucleic acid sample can be obtained by amplifying a target nucleic acid by polymerase chain reaction (PCR) followed by purification. In addition to PCR, there may be used LCR (Wu and Wallace, Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA 87, 1874 (1990)), or nucleic acid sequence based amplification (NASBA). The last two methods are isothermal reactions based on isothermal transcription and produce 30- or 100-fold RNA single strands and DNA double strands as amplification products.

According to an embodiment of the present invention, the step of determining a nucleotide of a polymorphic site includes hybridizing the nucleic acid sample onto a microarray on which polynucleotides for diagnosis or treatment of type II diabetes mellitus, including at least 10 contiguous nucleotides derived from the group consisting of nucleotide sequences of SEQ ID NOS: 1-80 and including a nucleotide of a polymorphic site (position 101), or complementary polynucleotides thereof are immobilized; and detecting the hybridization result.

A microarray and a method of preparing a microarray by immobilizing a probe polynucleotide on a substrate are well known in the pertinent art. Immobilization of a probe polynucleotide associated with type II diabetes mellitus of the present invention on a substrate can be easily performed using a conventional technique. Hybridization of nucleic acids on a microarray and detection of the hybridization result are also well known in the pertinent art. For example, the detection of the hybridization result can be performed by labeling a nucleic acid sample with a labeling material generating a detectable signal, such as a fluorescent material (e.g., Cy3 and Cy5), hybridizing the labeled nucleic acid sample onto a microarray, and detecting a signal generated from the labeling material.

According to another embodiment of the present invention, as a result of the determination of a nucleotide sequence of a polymorphic site, when at least one nucleotide sequence selected from SEQ ID NOS: 2, 4, 5, 8, 9, 11, 13, 16, 18, 20, 21, 23, 25, 27, 30, 32, 33, 36, 38, 40, 42, 44, 46, 47, 49, 51, 53, 55, 57, 59, 62, 63, 65, 67, 69, 71, 75, 77, and 80 containing risk alleles is detected, it may be determined that the individual has a higher likelihood of being diagnosed as at risk of developing type II diabetes mellitus. If more nucleotide sequences containing risk alleles are detected in an individual, it may be determined that the individual has a much higher likelihood of being diagnosed as at risk of developing type II diabetes mellitus.
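The counting logic of this embodiment can be sketched in a few lines. The set below copies the SEQ ID NOS listed in the paragraph above; the function name and the idea of passing detected sequences as a list of SEQ ID numbers are assumptions made for illustration, and the text does not define any numerical threshold.

```python
# SEQ ID NOS containing risk alleles, as listed in the text.
RISK_SEQ_IDS = frozenset({
    2, 4, 5, 8, 9, 11, 13, 16, 18, 20, 21, 23, 25, 27, 30, 32, 33, 36, 38,
    40, 42, 44, 46, 47, 49, 51, 53, 55, 57, 59, 62, 63, 65, 67, 69, 71, 75,
    77, 80,
})


def count_risk_sequences(detected_seq_ids):
    """Count how many detected sequences are in the risk-allele list."""
    return len(RISK_SEQ_IDS & set(detected_seq_ids))


# Hypothetical individual in whom SEQ ID NOS: 2, 3, and 5 were detected.
detected = [2, 3, 5]
n_risk = count_risk_sequences(detected)
print(n_risk, "risk-allele sequence(s) detected")
print("higher likelihood" if n_risk >= 1 else "no listed risk allele detected")
```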

Hereinafter, the present invention will be described more specifically with reference to an Example. However, the following Example is provided only for illustration and thus the present invention is not limited thereto.

EXAMPLE

Example 1

In this Example, DNA samples were extracted from the blood of a patient group consisting of 300 Korean persons who had been identified as type II diabetes mellitus patients and were under treatment, and of a normal group consisting of 300 persons free from symptoms of type II diabetes mellitus and of the same age as the patient group, and occurrence frequencies of specific SNPs were evaluated. The SNPs were selected from a known database (NCBI Single Nucleotide Polymorphism database or SEQUENOM RealSNP™ Assay Database). Primers hybridizing with sequences around the selected SNPs were used to assay the nucleotide sequences of SNPs in the DNA samples.

1. Preparation of DNA Samples

DNA samples were extracted from the blood of type II diabetes mellitus patients and normal persons. DNA extraction was performed according to a known extraction method (Molecular Cloning: A Laboratory Manual, p 392, Sambrook, Fritsch and Maniatis, 2nd edition, Cold Spring Harbor Press, 1989) and the specification of a commercial kit manufactured by Centra system. Among the extracted DNA samples, only DNA samples having a purity (A₂₆₀/A₂₈₀ nm) of at least 1.7 were used.
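The purity criterion above may be illustrated by the following minimal Python sketch. The threshold of 1.7 is taken from the text; the sample names and absorbance readings are hypothetical values chosen only for illustration.

```python
MIN_PURITY = 1.7  # minimum A260/A280 ratio stated in the text


def passes_purity(a260, a280, threshold=MIN_PURITY):
    """Return True if the A260/A280 ratio is at least the threshold."""
    return (a260 / a280) >= threshold


# Hypothetical absorbance readings: sample name -> (A260, A280).
readings = {"sample_1": (0.90, 0.50), "sample_2": (0.80, 0.50)}

usable = [name for name, (a260, a280) in readings.items()
          if passes_purity(a260, a280)]
print(usable)
```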

2. Amplification of Target DNAs

Target DNAs, which are predetermined DNA regions containing SNPs to be analyzed, were amplified by PCR. The PCR was performed by a common method under the following conditions. First, 2.5 ng/ml of target genomic DNAs were prepared. Then, the following PCR mixture was prepared.

Component                                      Volume
Water (HPLC grade)                             2.24 μl
10x buffer (15 mM MgCl₂, 25 mM MgCl₂)          0.5 μl
dNTP Mix (GIBCO) (25 mM for each)              0.04 μl
Taq pol (HotStar) (5 U/μl)                     0.02 μl
Forward/reverse primer Mix (1 μM for each)     0.02 μl
DNA                                            1.00 μl
Total volume                                   5.00 μl

Here, the forward and reverse primers were designed based on upstream and downstream sequences of SNPs in a known database. These primers are listed in Table 3 below.

The thermal cycles of PCR were as follows: incubation at 95° C. for 15 minutes; 45 cycles at 95° C. for 30 seconds, at 56° C. for 30 seconds, and at 72° C. for 1 minute; and incubation at 72° C. for 3 minutes and storage at 4° C. As a result, amplified DNA fragments of 200 nucleotides or less in length were obtained.
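For illustration only, the thermal cycling conditions above may be written as a data structure and their total programmed hold time computed, as in the following Python sketch. The temperatures, durations, and cycle count are those stated above; the structure and the function name `total_hold_seconds` are hypothetical, and ramp times between temperatures and the final 4° C. storage are not included.

```python
# Each stage: (name, [(temperature in deg C, hold time in seconds), ...], repeats)
PCR_PROGRAM = [
    ("initial incubation", [(95, 15 * 60)], 1),
    ("amplification", [(95, 30), (56, 30), (72, 60)], 45),
    ("final incubation", [(72, 3 * 60)], 1),
]


def total_hold_seconds(program):
    """Sum the programmed hold times over all stages and repeats."""
    return sum(repeats * sum(seconds for _, seconds in steps)
               for _, steps, repeats in program)


total = total_hold_seconds(PCR_PROGRAM)
print(f"Programmed hold time: {total} s ({total / 60:.0f} min)")
```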

3. Analysis of SNPs in Amplified Target DNA Fragments

Analysis of SNPs in the amplified target DNA fragments was performed using a homogeneous MassExtension (hME) technique available from Sequenom. The principle of the MassExtension technique is as follows. First, primers (also called “extension primers”) ending immediately before SNPs within the target DNA fragments were designed. Then, the primers were hybridized with the target DNA fragments and DNA polymerization was performed. At this time, a polymerization solution contained a reagent (e.g., ddTTP) terminating the polymerization immediately after the incorporation of a nucleotide complementary to a first allelic nucleotide (e.g., A allele). In this regard, when the first allele (e.g., A allele) exists in the target DNA fragments, products in which only a nucleotide (e.g., T nucleotide) complementary to the first allele is extended from the primers will be obtained. On the other hand, when a second allele (e.g., G allele) exists in the target DNA fragments, a nucleotide (e.g., C nucleotide) complementary to the second allele is added to the 3′-ends of the primers, and then the primers are extended until a nucleotide (e.g., T nucleotide) complementary to the closest first-allele nucleotide (e.g., A) in the template is added. The lengths of products extended from the primers were determined by mass spectrometry. Therefore, alleles present in the target DNA fragments could be identified. Illustrative experimental conditions were as follows.
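The extension principle described above may be sketched in Python as follows. This is a simplified model that assumes a single terminating nucleotide (T, supplied as ddTTP, as in the example above) and ordinary extension for the other three nucleotides; the function name `extend_primer` and the template sequences are hypothetical and do not correspond to any sequence of the present invention.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_from_snp, terminator="T"):
    """Return the nucleotides added to the 3' end of the extension primer.

    template_from_snp: template bases in the order the polymerase reads
        them, starting at the polymorphic site.
    terminator: the nucleotide supplied in dideoxy form; extension stops
        immediately after it is incorporated.
    """
    added = []
    for base in template_from_snp:
        nucleotide = COMPLEMENT[base]
        added.append(nucleotide)
        if nucleotide == terminator:
            break  # chain termination after the dideoxynucleotide
    return "".join(added)


# Hypothetical templates differing only at the polymorphic site (first base).
print(extend_primer("ACA"))  # A allele: terminates at the first added T
print(extend_primer("GCA"))  # G allele: extends until the next template A
```

In this model the A allele yields a one-nucleotide extension and the G allele a longer one, so the two products differ in mass and can be distinguished by mass spectrometry.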

First, unreacted dNTPs were removed from the PCR products. For this, 1.53 μl of deionized water, 0.17 μl of hME buffer, and 0.30 μl of shrimp alkaline phosphatase (SAP) were added and mixed in 1.5 ml tubes to prepare SAP enzyme solutions. The tubes were centrifuged at 5,000 rpm for 10 seconds. Thereafter, the PCR products were added to the SAP solution tubes, sealed, incubated at 37° C. for 20 minutes and then at 85° C. for 5 minutes, and stored at 4° C.

Next, homogeneous extension was performed using the amplified target DNA fragments as templates. The compositions of the reaction solutions for the extension were as follows.

Component                                                       Volume
Water (nanoscale deionized water)                               1.728 μl
hME extension mix (10x buffer containing 2.25 mM d/ddNTPs)      0.200 μl
Extension primers (100 μM for each)                             0.054 μl
Thermosequenase (32 U/μl)                                       0.018 μl
Total volume                                                    2.00 μl
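When many samples are processed together, the per-reaction composition above may be scaled to a batch, as in the following Python sketch. The per-reaction volumes are those listed above; the function name `scale_mix`, the batch size of 96, and the optional `overage` parameter (extra volume to allow for pipetting loss) are assumptions introduced only for illustration.

```python
# Per-reaction volumes in microliters, as listed in the composition above.
EXTENSION_MIX_UL = {
    "water": 1.728,
    "hME extension mix": 0.200,
    "extension primers": 0.054,
    "Thermosequenase": 0.018,
}


def scale_mix(per_reaction, n_reactions, overage=0.0):
    """Scale per-reaction volumes to a batch.

    overage is a hypothetical fractional excess (e.g., 0.1 for 10 percent).
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(volume * factor, 3)
            for name, volume in per_reaction.items()}


per_reaction_total = sum(EXTENSION_MIX_UL.values())
print(f"Per-reaction total: {per_reaction_total:.3f} ul")

for name, volume in scale_mix(EXTENSION_MIX_UL, 96).items():
    print(f"{name}: {volume} ul")
```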

The reaction solutions were thoroughly mixed with the previously prepared target DNA solutions and subjected to spin-down centrifugation. Tubes or plates containing the reaction solutions were tightly sealed and incubated at 94° C. for 2 minutes, followed by homogeneous extension for 40 cycles at 94° C. for 5 seconds, at 52° C. for 5 seconds, and at 72° C. for 5 seconds, and storage at 4° C. The homogeneous extension products thus obtained were washed with a resin (SpectroCLEAN). Extension primers used in the extension are listed in Table 3 below.

TABLE 3
Primers for amplification and extension primers for homogeneous extension for target DNAs

          Amplification primer (SEQ ID NO)      Extension primer
Marker    Forward primer    Reverse primer      (SEQ ID NO)
DMX_001   81                82                  83
DMX_003   84                85                  86
DMX_005   87                88                  89
DMX_008   90                91                  92
DMX_009   93                94                  95
DMX_011   96                97                  98
DMX_012   99                100                 101
DMX_014   102               103                 104
DMX_016   105               106                 107
DMX_019   108               109                 110
DMX_027   111               112                 113
DMX_028   114               115                 116
DMX_029   117               118                 119
DMX_030   120               121                 122
DMX_031   123               124                 125
DMX_032   126               127                 128
DMX_033   129               130                 131
DMX_044   132               133                 134
DMX_049   135               136                 137
DMX_052   138               139                 140
DMX_054   141               142                 143
DMX_056   144               145                 146
DMX_060   147               148                 149
DMX_061   150               151                 152
DMX_062   153               154                 155
DMX_063   156               157                 158
DMX_065   159               160                 161
DMX_067   162               163                 164
DMX_068   165               166                 167
DMX_069   168               169                 170
DMX_104   171               172                 173
DMX_105   174               175                 176
DMX_116   177               178                 179
DMX_117   180               181                 182
DMX_120   183               184                 185
DMX_136   186               187                 188
DMX_139   189               190                 191
DMX_150   192               193                 194
DMX_152   195               196                 197
DMX_154   198               199                 200

Nucleotides of polymorphic sites in the extension products were assayed using MALDI-TOF (Matrix Assisted Laser Desorption and Ionization-Time of Flight) mass spectrometry. MALDI-TOF operates according to the following principle. When an analyte is exposed to a laser beam, it flies, together with an ionized matrix, toward a detector positioned at the opposite side in a vacuum state. At this time, the time taken for the analyte to reach the detector is measured. A material with a smaller mass reaches the detector more rapidly. The nucleotides of SNPs in the target DNA fragments are determined based on a difference in mass between the DNA fragments and known nucleotide sequences of the SNPs.
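The relation between mass and arrival time described above may be illustrated with the idealized linear time-of-flight formula t = L·sqrt(m / (2·q·U)), in which an ion of mass m and charge q is accelerated through a potential U and then drifts over a length L. The following Python sketch is given under stated assumptions: the accelerating voltage, the drift length, and the analyte masses are hypothetical values, not instrument settings of the present Example.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs
DALTON_KG = 1.66053906660e-27          # kilograms per dalton


def flight_time(mass_da, charge=1, accel_voltage=20_000.0, drift_length=1.0):
    """Idealized flight time in seconds for an ion in a linear TOF analyzer.

    accel_voltage (volts) and drift_length (meters) are hypothetical values.
    """
    mass_kg = mass_da * DALTON_KG
    charge_c = charge * ELEMENTARY_CHARGE_C
    return drift_length * math.sqrt(mass_kg / (2.0 * charge_c * accel_voltage))


# Hypothetical extension products of two alleles, differing in mass.
t_short = flight_time(5000.0)
t_long = flight_time(6000.0)
print(f"lighter product: {t_short * 1e6:.2f} us")
print(f"heavier product: {t_long * 1e6:.2f} us")
```

Because the flight time grows with the square root of the mass, the lighter extension product reaches the detector first, consistent with the principle stated above.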

Determination results of the nucleotides of polymorphic sites of the target DNAs using MALDI-TOF are shown in Tables 1-1, 1-2, 2-1, and 2-2. Each allele may exist in homozygous or heterozygous form in an individual. However, the distribution of heterozygote and homozygote frequencies in a given population does not exceed a statistically significant level. According to Mendel's law of inheritance and the Hardy-Weinberg law, the genetic makeup of alleles constituting a population is maintained at a constant frequency. When the genetic makeup is statistically significant, it can be considered to be biologically meaningful. The SNPs according to the present invention occur in type II diabetes mellitus patients at a statistically significant level, as shown in Tables 1-1, 1-2, 2-1, and 2-2, and thus can be efficiently used in diagnosis of type II diabetes mellitus.
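One common way to examine whether observed genotype frequencies are consistent with the Hardy-Weinberg law is a chi-square goodness-of-fit test with one degree of freedom. The following Python sketch shows such a test under stated assumptions: it is not asserted to be the statistical procedure used to produce Tables 1-1 to 2-2, and the genotype counts are hypothetical.

```python
import math


def hardy_weinberg_test(n_aa, n_ab, n_bb):
    """Chi-square test of genotype counts against Hardy-Weinberg proportions.

    Returns (chi_square, p_value) for one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic site).
    chi_square = sum((o - e) ** 2 / e
                     for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical genotype counts (AA, AB, BB) in a group of 300 persons.
chi_square, p_value = hardy_weinberg_test(75, 150, 75)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.3f}")
```

A large p-value indicates that the genotype distribution does not deviate significantly from Hardy-Weinberg proportions; a small p-value indicates a deviation.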

The polynucleotide according to the present invention can be used in diagnosis, treatment, or fingerprinting analysis of type II diabetes mellitus.

The microarray and diagnostic kit including the polynucleotide according to the present invention can be used for efficient diagnosis of type II diabetes mellitus.

The method of analyzing polynucleotides associated with type II diabetes mellitus according to the present invention can efficiently detect the presence or a risk of type II diabetes mellitus.

1. A method for determining increased risk of developing type II diabetes mellitus in a human comprising: determining the nucleotide at a polymorphic site at position 101 of SEQ ID NO: 23 in a nucleic acid sample from the human is thymine (T), and determining the human has an increased risk of developing type II diabetes mellitus.

2. The method of claim 1, wherein determining the nucleotide of the polymorphic site comprises: hybridizing the nucleic acid sample onto a microarray on which is immobilized a polynucleotide comprising (a) at least 10 contiguous nucleotides of SEQ ID NO: 23 which comprise position 101, or (b) the complement of (a); and detecting a hybridization result.

3. The method of claim 1, further comprising: determining the genotype in the nucleic acid sample at the polymorphic site is TC, and determining the human has an increased risk of developing type II diabetes as compared to determining the genotype is CC.